

Search for: All records

Creators/Authors contains: "Bichler, Sarah"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full-text articles may not yet be available free of charge during the embargo period.


  1. (editor not listed)
  2. de Vries, E.; Hod, Y.; Ahn, J. (Eds.)
    We report on design-based research to refine a professional development workshop that supports teachers in customizing online curricula. We iteratively designed representations to make the knowledge integration pedagogy of the curricula visible and studied ways to make the work of students using the curricula actionable for participating teachers. Analyzing participants' trajectories across the three iterations of the workshop, we found that participants initially developed feelings of ownership when they realized they could customize the online curriculum. Then, as participants deepened their understanding of the pedagogy, they began to use it to evaluate their own instruction. The trajectory culminated in participants connecting the pedagogy to student work from their own classrooms. This led to a shift from focusing on remedies for misconceptions to seeking opportunities to build on students' nascent ideas when customizing. The workshop refinements empowered teachers to mobilize the pedagogy to interpret their students' work and inform their customization decisions.
  3. (editor not listed)
    Recent work on automated scoring of student responses in educational applications has shown gains in human-machine agreement from neural models, particularly recurrent neural networks (RNNs) and pre-trained transformer (PT) models. However, prior research has neglected to investigate the reasons for this improvement; in particular, whether models achieve gains for the “right” reasons. Through expert analysis of saliency maps, we analyze the extent to which models attribute importance to words and phrases in student responses that align with question rubrics. We focus on responses to questions embedded in science units for middle school students, accessed via an online classroom system. RNN and PT models were trained to predict an ordinal score from each response’s text, and experts analyzed the generated saliency maps for each response. Our analysis shows that RNN and PT models can produce substantially different saliency profiles while often predicting the same scores for the same student responses. While there is some indication that PT models are better able to avoid spurious correlations of high-frequency words with scores, results indicate that both model classes focus on learning statistical correlations between scores and words and do not demonstrate an ability to learn key phrases or longer linguistic units corresponding to ideas, which are the targets of question rubrics. These results point to a need for models that better capture student ideas in educational applications.
  4. With the widespread adoption of the Next Generation Science Standards (NGSS), science teachers and online learning environments face the challenge of evaluating students' integration of different dimensions of science learning. Recent advances in representation learning in natural language processing have proven effective across many natural language processing tasks, but a rigorous evaluation of the relative merits of these methods for scoring complex constructed response formative assessments has not previously been carried out. We present a detailed empirical investigation of feature-based, recurrent neural network, and pre-trained transformer models on scoring content in real-world formative assessment data. We demonstrate that recent neural methods can rival or exceed the performance of feature-based methods. We also provide evidence that different classes of neural models take advantage of different learning cues, and pre-trained transformer models may be more robust to spurious, dataset-specific learning cues, better reflecting scoring rubrics. 
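
The saliency-map analysis described in entry 3 can be illustrated with a minimal sketch. A bag-of-words linear scorer stands in for the RNN/PT models (for a linear model, input-times-gradient saliency is simply `x * w`); the vocabulary, weights, and rubric terms below are invented for illustration and are not taken from the papers.

```python
import numpy as np

# Hypothetical vocabulary and rubric keywords (illustrative only).
vocab = ["energy", "transfer", "because", "the", "hot", "cold"]
rubric_terms = {"energy", "transfer"}

# Toy linear scorer weights, standing in for an RNN/PT score head.
w = np.array([1.2, 0.9, 0.3, 0.05, 0.4, 0.4])

def saliency(x, w):
    """Input-times-gradient saliency; for a linear model, d(logit)/dx = w."""
    return x * w

# A student response encoded as bag-of-words counts over `vocab`.
x = np.array([1.0, 1.0, 1.0, 2.0, 0.0, 0.0])
s = saliency(x, w)
ranked = [vocab[i] for i in np.argsort(-s)]   # words by attributed importance
aligned = rubric_terms.issubset(set(ranked[:2]))
print(ranked[:3], aligned)   # ['energy', 'transfer', 'because'] True
```

Comparing `ranked` against `rubric_terms` mirrors the question the paper asks of its saliency maps: whether high-importance tokens correspond to rubric-relevant ideas or merely to high-frequency correlates of the score.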
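
Entries 3 and 4 both evaluate human-machine agreement on ordinal scores; quadratic weighted kappa is a standard metric for that setting. The abstracts do not name their exact metric, so this is a general-purpose sketch with invented score vectors.

```python
import numpy as np

def quadratic_weighted_kappa(a, b, n_classes):
    """Quadratic weighted kappa: chance-corrected agreement for ordinal labels,
    penalizing disagreements by squared distance between score levels."""
    O = np.zeros((n_classes, n_classes))          # observed confusion matrix
    for i, j in zip(a, b):
        O[i, j] += 1
    W = np.array([[(i - j) ** 2 for j in range(n_classes)]
                  for i in range(n_classes)], dtype=float) / (n_classes - 1) ** 2
    E = np.outer(O.sum(axis=1), O.sum(axis=0)) / len(a)   # expected by chance
    return 1.0 - (W * O).sum() / (W * E).sum()

human = [0, 1, 2, 2, 1, 0, 2, 1]          # hypothetical human scores
model = [0, 1, 2, 1, 1, 0, 2, 1]          # hypothetical model predictions
print(round(quadratic_weighted_kappa(human, model, 3), 3))   # 0.889
```

Perfect agreement yields 1.0, chance-level agreement yields 0.0, and near-miss errors (off by one score level) are penalized less than large ones, which suits ordinal rubric scores.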